Random Matrix Theory
1. Randomized Matrix Multiplication
To approximate the product of a large \( m\times n \) matrix \( A \) and an \( n \times l \) matrix \( B \), pairs consisting of the \( j \)th column of \( A \) and the \( j \)th row of \( B \) are sampled at random and summed up.
Start by noticing that \[ AB = \sum_{j=1}^n a_jb^{\top}_j \] where \( a_j \) is the \( j \)th column of \( A \) and \( b^{\top}_j \) is the \( j \)th row of \( B \).
An approximation of the matrix product is then given by the sum of samples \( a_{j(i)}b^{\top}_{j(i)} \) with the indices \( j(i) \) drawn uniformly: \[ \widehat{AB} =\frac{n}{s} \sum_{i=1}^s a_{j(i)}b^{\top}_{j(i)} = \sum_{i=1}^s \frac{na_{j(i)}b^{\top}_{j(i)}}{s} \] where \( s \) is the number of samples. Notice that each sample needs to be normalized by the factor \( n/s \) for the estimator to be unbiased.
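As a quick illustration, here is a minimal NumPy sketch of the uniform-sampling estimator (the function name, matrix sizes, and sample count are made up for the example):

```python
import numpy as np

def sample_matmul_uniform(A, B, s, rng=None):
    """Approximate A @ B from s uniformly sampled column/row pairs."""
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[1]
    idx = rng.integers(0, n, size=s)        # indices j(1), ..., j(s)
    # Sum of the rank-one samples a_j b_j^T, normalized by n/s.
    return (n / s) * A[:, idx] @ B[idx, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 1000))
B = rng.standard_normal((1000, 50))
print(np.linalg.norm(sample_matmul_uniform(A, B, s=200, rng=rng) - A @ B, "fro"))
```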
The approximation can be improved further by adjusting the probability \( P_j \) of choosing the \( j \)th pair so as to reduce the variance of \( \widehat{AB} \). In order to keep the mean, the sample normalization has to change as follows \[ \frac{na_jb^{\top}_j}{s \cdot nP_j} = \frac{a_jb^{\top}_j}{sP_j} \] and the approximation formula changes accordingly \[ \widehat{AB} = \sum_{i=1}^s \frac{a_{j(i)}b^{\top}_{j(i)}}{sP_{j(i)}} \] so that \[ \mathrm{E}[\widehat{AB}] = \sum_{i=1}^s \mathrm{E}\left[ \frac{a_{j(i)}b^{\top}_{j(i)}}{sP_{j(i)}} \right] = \sum_{i=1}^s\sum_{j=1}^n P_j\,\frac{a_{j}b^{\top}_{j}}{sP_j} = \sum_{i=1}^s\sum_{j=1}^n \frac{a_{j}b^{\top}_{j}}{s} = AB. \] For the record, the estimator can also be written with a diagonal sampling matrix \( S \), \[ \widehat{AB} = ASB, \quad S_{jj} := \frac{k_j}{sP_j}, \] where \( k_j \) counts how many times the index \( j \) was drawn (so \( S_{jj} = 0 \) for indices that were never sampled).
Now, let us find the probability distribution \( P_{j} \) that minimizes the variance \[ \mathrm{Var}[\widehat{AB}] := \mathrm{E} \left[ \| \widehat{AB} - AB \|_F^2 \right] \] where \( \| \cdot \|_F \) is the Frobenius norm. By the independence of the samples and the bilinearity of the Frobenius inner product,
\begin{align*} \mathrm{Var}[\widehat{AB}] &= s\,\mathrm{Var} \left[ \frac{a_jb^{\top}_j}{sP_j} \right] \\ &= s \left( \mathrm{E} \left[ \left\|\frac{a_jb^{\top}_{j}}{sP_j} \right\|_F^2 \right] - \left\| \mathrm{E} \left[ \frac{a_jb^{\top}_j}{sP_j} \right] \right\|_F^2 \right)\\ &= s \left[ \sum_{j=1}^nP_j \left( \sum_{i,k} \frac{a_{ij}^2b_{jk}^2}{s^2P_j^2} \right) - \left\|\sum_{j=1}^nP_j \frac{a_j b^{\top}_j}{sP_j}\right\|_F^2 \right] \\ &= s \left[ \sum_{j=1}^n \frac{\|a_j\|^2 \| b^{\top}_j\|^2}{s^2P_j} - \frac{1}{s^2}\|AB\|_F^2 \right] \\ &= \sum_{j=1}^n \frac{\|a_j\|^2 \| b^{\top}_j\|^2}{sP_j} - \frac{1}{s}\|AB\|_F^2. \end{align*}

Next, minimizing the first term under the constraint \[ \sum_{j=1}^nP_j = 1 \] with a Lagrange multiplier \( \lambda \) gives \[ \frac{\partial}{\partial P_j} \left[ \sum_{k=1}^n \frac{\|a_k\|^2 \|b^{\top}_k\|^2}{sP_k} + \lambda \left( \sum_{k=1}^n P_k - 1 \right) \right] = -\frac{\|a_j\|^2 \|b^{\top}_j\|^2}{sP_j^2} + \lambda = 0, \] so \( P_j \propto \|a_j\|\|b^{\top}_j\| \), i.e. \[ P_j = \frac{\|a_j\|\|b^{\top}_j\|}{Z}, \quad Z := \sum_{j=1}^n \|a_j\|\|b^{\top}_j\|. \]
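A sketch of the importance-sampled estimator with these norm-proportional probabilities, compared against uniform sampling (the function name and problem sizes are illustrative; with i.i.d. Gaussian test matrices the column norms are nearly equal, so the gain is modest):

```python
import numpy as np

def sample_matmul(A, B, s, p, rng=None):
    """Approximate A @ B from s column/row pairs drawn with probabilities p."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(A.shape[1], size=s, p=p)
    # Weight each sample a_j b_j^T by 1/(s * P_j) to keep the estimator unbiased.
    w = 1.0 / (s * p[idx])
    return (A[:, idx] * w) @ B[idx, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 2000))
B = rng.standard_normal((2000, 50))

n = A.shape[1]
uniform = np.full(n, 1.0 / n)
optimal = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)  # P_j ∝ ||a_j|| ||b_j||
optimal /= optimal.sum()

exact = A @ B
for name, p in [("uniform", uniform), ("optimal", optimal)]:
    est = sample_matmul(A, B, s=200, p=p, rng=rng)
    print(name, np.linalg.norm(est - exact, "fro"))
```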
2. Freivalds' Algorithm
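Freivalds' algorithm verifies a claimed product \( C = AB \) without recomputing \( AB \): draw a random vector \( x \) and compare \( A(Bx) \) with \( Cx \), which costs \( O(n^2) \) per trial instead of \( O(n^3) \). If \( C \neq AB \) and \( x \) has i.i.d. entries from \( \{0,1\} \), a single trial passes with probability at most \( 1/2 \), so \( k \) independent trials drive the error probability below \( 2^{-k} \). A minimal sketch (the tolerance-based comparison is an assumption for floating-point inputs; use exact equality for integer matrices):

```python
import numpy as np

def freivalds(A, B, C, trials=20, rng=None):
    """Randomized check of the claim C == A @ B."""
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(trials):
        x = rng.integers(0, 2, size=C.shape[1])   # random 0/1 vector
        # A @ (B @ x) and C @ x are matrix-vector products: O(n^2) each.
        if not np.allclose(A @ (B @ x), C @ x):
            return False          # a mismatch proves C != A @ B
    return True                   # C == A @ B with high probability
```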
3. Ensemble
A set of matrices obeying the constraints of the problem, whose entries follow a certain probability distribution. Often the entries are independent and identically distributed (i.i.d.).
3.1. Ensemble Average
Any function of an \( n\times n \) matrix can be averaged over a given ensemble \[ \langle f(M) \rangle := \int dM\,f(M)p(M). \] Basis-invariant quantities, including the determinant and the trace of the \( k \)th power of a matrix, are of particular interest.
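For instance, such an average can be estimated by Monte Carlo. A sketch for \( \langle \mathrm{tr}\,M^2 \rangle \) over symmetric matrices with Gaussian entries (a GOE-like ensemble; the sampler and normalization here are illustrative assumptions):

```python
import numpy as np

def ensemble_average(f, sample, trials=2000):
    """Monte Carlo estimate of <f(M)> over matrices drawn by sample()."""
    return sum(f(sample()) for _ in range(trials)) / trials

rng = np.random.default_rng(0)
n = 50

def goe_like():
    G = rng.standard_normal((n, n))
    return (G + G.T) / 2          # random symmetric matrix

# Basis-invariant observable: the trace of the second power.
avg = ensemble_average(lambda M: np.trace(M @ M), goe_like)
```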
Matrices of different sizes, whose entries are drawn from an identical distribution, exhibit the same eigenvalue spectrum once the eigenvalues are rescaled with the matrix size \( n \) (by \( \sqrt{n} \) for Wigner-type matrices).
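A quick numerical check of this, assuming Wigner-type scaling: after dividing the eigenvalues by \( \sqrt{n} \), the spectra of symmetric Gaussian matrices of different sizes land on (approximately) the same support.

```python
import numpy as np

rng = np.random.default_rng(0)

def rescaled_spectrum(n):
    G = rng.standard_normal((n, n))
    M = (G + G.T) / np.sqrt(2)                   # entries of unit variance
    return np.linalg.eigvalsh(M) / np.sqrt(n)    # rescale with matrix size

# Both spectra roughly fill the same interval [-2, 2].
for n in (200, 800):
    lam = rescaled_spectrum(n)
    print(n, lam.min(), lam.max())
```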
4. Eigenvalue Spacing
The probability density function of the spacing \( s \) between uncorrelated eigenvalues is exponential (Poisson statistics), \[ P(s) = ce^{-cs}. \]

For certain kinds of correlated eigenvalues, such as those of the Gaussian orthogonal ensemble, the spacing distribution instead exhibits level repulsion and follows the Wigner surmise \[ P(s) \propto s e^{-cs^2/2}. \]
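A rough numerical comparison of the two cases, with no careful unfolding (only the bulk of the spectrum is used, spacings are normalized to unit mean, and the sizes and threshold are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Uncorrelated levels: sorted i.i.d. points have exponential spacings.
poisson = np.sort(rng.uniform(0, n, size=n))

# Correlated levels: eigenvalues of a random symmetric (GOE-like) matrix.
G = rng.standard_normal((n, n))
goe = np.linalg.eigvalsh((G + G.T) / np.sqrt(2))

for name, levels in [("poisson", poisson), ("goe", goe)]:
    s = np.diff(levels[n // 4 : 3 * n // 4])   # spacings in the bulk
    s /= s.mean()                              # unit mean spacing
    # Level repulsion: small spacings are much rarer in the GOE case.
    print(name, np.mean(s < 0.1))
```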
5. References
- Random Matrices in Unexpected Places: Atomic Nuclei, Chaotic Billiards, Riema…
- Lecture 13: Randomized Matrix Multiplication - YouTube
- Gilbert Strang. Linear Algebra and Learning from Data. 2019.